"Sparsification" of Audio Signals Using the MDCT/IntMDCT and a Psychoacoustic Model - Application to Informed Audio Source Separation
Abstract
Sparse representations have proved a very useful tool in a variety of domains, e.g. speech/music source separation. As strictly sparse representations (in the sense of the ℓ0 norm) are often impossible to achieve, other ways of studying signal sparsity have been proposed. In this paper, we revisit the irrelevance filtering analysis-synthesis approach proposed in (Balazs et al., IEEE Trans. ASLP, 18(1), 2010), where the time-frequency (TF) coefficients that fall below some masking threshold are set to zero. Instead of using the Gabor transform and a specific psychoacoustic model, we use tools directly inspired by perceptual audio coding, for instance MPEG-AAC. We show that significantly better "sparsification performance" is obtained on music signals, at lower computational cost. We then apply the sparsification process to the informed source separation (ISS) problem and show that it significantly decreases the computational cost at the ISS decoder.
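The zeroing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sparsify`, the toy coefficient values, and the per-bin thresholds are all hypothetical, and the comparison is done on linear magnitudes purely for simplicity. In a real system the coefficients would come from an MDCT/IntMDCT and the thresholds from a psychoacoustic model.

```python
import numpy as np

def sparsify(coeffs, masking_threshold):
    """Zero every coefficient whose magnitude is below its masking threshold.

    Both arguments are arrays of the same shape (hypothetical layout:
    one value per frequency bin). Returns the sparsified coefficients
    and the boolean mask of the coefficients that were kept.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    masking_threshold = np.asarray(masking_threshold, dtype=float)
    keep = np.abs(coeffs) >= masking_threshold
    return np.where(keep, coeffs, 0.0), keep

# Toy frame of six transform coefficients (made-up values, not real audio).
frame = np.array([0.90, -0.02, 0.30, 0.01, -0.50, 0.04])
# Made-up per-bin thresholds standing in for a psychoacoustic model's output.
threshold = np.array([0.05, 0.05, 0.10, 0.10, 0.20, 0.20])

sparse_frame, keep = sparsify(frame, threshold)
zero_ratio = 1.0 - keep.mean()  # fraction of coefficients set to zero
print(sparse_frame, zero_ratio)
```

The fraction of zeroed coefficients (`zero_ratio` here) is one simple way to quantify how much a frame has been sparsified.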
Similar resources
A High-rate Data Hiding Technique for Audio Signals Based on IntMDCT Quantization
Data hiding consists in hiding/embedding binary information within a signal in an imperceptible way. In this study we propose a high-rate data hiding technique suitable for uncompressed audio signals (PCM as used in Audio-CD and .wav format). This technique is appropriate for non-securitary applications, such as enriched-content applications, that require a large bitrate but no particular robus...
Lossless Audio Compression Using Integer Modified Discrete Cosine Transform
Recently, an MPEG2 AAC [1] based lossless audio codec with the Integer MDCT (IntMDCT) was proposed [2]. The IntMDCT was constructed using the lifting scheme [3] to preserve perfect reconstruction (PR). In this paper, we evaluate the IntMDCT implemented in fixed-point arithmetic with quantized lifting coefficients in MPEG2 AAC based lossless audio coding. The results indicate that there exists ...
A high-capacity watermarking technique for audio signals based on MDCT-domain quantization
Watermarking is a technique that consists in hiding/embedding binary information within a signal in an imperceptible way, meaning, in the present context of audio signals, that the mark is inaudible. Watermarking was first used for the protection of digital contents as part of DRM (Digital Rights Management). In this context of secured applications, important efforts were devoted to ensure ro...
Fine grain scalable perceptual and lossless audio coding based on IntMDCT
This paper presents an embedded fine grain scalable perceptual and lossless audio coding scheme. The enabling technology for this combined perceptual and lossless audio coding approach is the Integer Modified Discrete Cosine Transform (IntMDCT), which is an integer approximation of the MDCT based on the lifting scheme. It maintains the perfect reconstruction property and therefore enables effi...
A Study of the Effect of Source Sparsity for Various Transforms on Blind Audio Source Separation Performance
In this paper, the problem of blind separation of underdetermined noisy mixtures of audio sources is considered. The sources are assumed to be sparsely represented in a transform domain. The sparsity of their analysis coefficients is modelled by the Student t distribution. This prior allows for robust Bayesian estimation of the sources, the mixing matrix, the additive noise variance as well as ...
Publication date: 2011